Simplifying the reversed duplicate removal procedure

Similar resources

Distributed Duplicate Removal

The distributed duplicate removal problem is concerned with detecting and then eliminating all duplicate elements in a given multiset that is distributed over several computers connected by a network. Sanders et al. [48] outline a communication-efficient algorithm that solves this problem. It uses distributed compressed single-shot Bloom filters to identify distinct elements using mini...

Duplicate Removal in Information Dissemination

Our experience with the SIFT [YGM95] information dissemination system (in use by over 7,000 users daily) has identified an important and generic dissemination problem: duplicate information. In this paper we explain why duplicates arise, we quantify the problem, and we discuss why it impairs information dissemination. We then propose a Duplicate Removal Module (DRM) for an information disseminati...

Duplicate Removal for Candidate Answer Sentences

In this paper, we describe the duplicate removal component of Infolab’s question answering system, which contributed to CSAIL’s entry in the TREC-15 Question Answering track. The goal of the Question Answering Track is to provide short, succinct answers to English sentences posed by users. In answering definition questions, we are asked to retrieve new and relevant information, in the form of short...

Motion analysis for duplicate frame removal in wireless capsule endoscope

Wireless capsule endoscopy (WCE) has rapidly found wide application in the medical domain over the last ten years, thanks to its noninvasiveness for patients and its support for thorough inspection of a patient’s entire digestive system, including the small intestine. However, one of the main barriers to an efficient clinical inspection procedure is that it requires a large amount of effort for clinicians to ins...

SEAL: a distributed short read mapping and duplicate removal tool

Summary: SEAL is a scalable tool for short read pair mapping and duplicate removal. It computes mappings that are consistent with those produced by BWA and removes duplicates according to the same criteria employed by Picard MarkDuplicates. On a 16-node Hadoop cluster, it is capable of processing about 13 GB per hour in map+rmdup mode, while reaching a throughput of 19 GB per hour in mapping-onl...

Journal

Journal title: Journal of the American Society for Information Science and Technology

Year: 2003

ISSN: 1532-2882,1532-2890

DOI: 10.1002/asi.10199